(54) Method and apparatus for determining feature points

(57) A method for determining feature points comprises the steps of: (a) providing directional gradients and
a gradient magnitude for each pixel in a video frame; (b) normalizing the directional gradients by dividing the
directional gradients by the gradient magnitude; (c) generating a first edge map having the gradient magnitude
for each pixel; (d) generating a second edge map having the normalized directional gradients for each pixel;
(e) dividing the first edge map into a plurality of blocks of an identical size; (f) providing, for each of the pixels
included in each of the blocks, normalized directional gradients for a set of a predetermined number of pixels
from the second edge map; (g) obtaining a variance for each of the pixels included in each of the blocks based
on the provided normalized directional gradients; and (h) determining a feature point for each of the blocks based
on the gradient magnitude and variance corresponding to each of the pixels therein.

Description

Field of the Invention

The present invention relates to a method and an apparatus for determining feature points; and, more particularly, 
to a method and an apparatus for determining feature points based on pixel intensity gradients and variances thereof. 

Description of the Prior Art 

As is well known, transmission of digitized video signals can attain video images of a much higher quality than the
transmission of analog signals. When an image signal comprising a sequence of image "frames" is expressed in a
digital form, a substantial amount of data is generated for transmission, especially in the case of a high definition
television system. Since, however, the available frequency bandwidth of a conventional transmission channel is
limited, in order to transmit the substantial amounts of digital data therethrough, it is necessary to compress or
reduce the volume of the transmission data. Among various video compression techniques, the so-called hybrid coding
technique, which combines temporal and spatial compression techniques together with a statistical coding technique,
is known to be most effective.

Most hybrid coding techniques employ a motion compensated DPCM (differential pulse coded modulation), which is
a process of estimating the movement of an object between a current frame and its previous frame, and predicting the
current frame according to the motion flow of the object to produce a differential signal representing the difference
between the current frame and its prediction. This method is described, for example, in Staffan Ericsson, "Fixed and
Adaptive Predictors for Hybrid Predictive/Transform Coding", IEEE Transactions on Communications, COM-33, No. 12
(December 1985); and in Ninomiya and Ohtsuka, "A Motion-Compensated Interframe Coding Scheme for Television
Pictures", IEEE Transactions on Communications, COM-30, No. 1 (January 1982).

In the motion compensated DPCM, current frame data is predicted from the corresponding previous frame data
based on an estimation of the motion between the current and the previous frames. Such estimated motion may be
described in terms of two dimensional motion vectors representing the displacement of pixels between the previous and
the current frames.

There have been two basic approaches to estimate the displacement of pixels of an object: one is a block-by-block
estimation and the other is a pixel-by-pixel approach.

In the block-by-block motion estimation, a block in a current frame is compared with blocks in its previous frame until
a best match is determined. From this, an interframe displacement vector (representing how much the block of pixels
has moved between frames) for the whole block can be estimated for the current frame being transmitted. However, in
the block-by-block motion estimation, poor estimates may result if all pixels in the block do not move in the same way,
thereby decreasing the overall picture quality.
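
The block-by-block estimation described above can be illustrated with a small full-search sketch. The text does not name a matching criterion, so the sum of absolute differences (SAD) used here is an assumption, as are the function names `block_sad` and `match_block`, the list-of-rows frame layout, and the choice to skip candidates that leave the frame.

```python
from typing import List, Tuple

Frame = List[List[int]]  # grayscale frame stored as rows of pixel intensities


def block_sad(cur: Frame, prev: Frame, x: int, y: int, dx: int, dy: int, size: int) -> int:
    """Sum of absolute differences between a current block and a displaced previous block."""
    return sum(
        abs(cur[y + j][x + i] - prev[y + dy + j][x + dx + i])
        for j in range(size)
        for i in range(size)
    )


def match_block(cur: Frame, prev: Frame, x: int, y: int, size: int, search: int) -> Tuple[int, int]:
    """Return the displacement (dx, dy) into the previous frame with the lowest SAD."""
    height, width = len(prev), len(prev[0])
    best, best_cost = (0, 0), None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate blocks that fall outside the previous frame (assumption).
            if not (0 <= x + dx and x + dx + size <= width
                    and 0 <= y + dy and y + dy + size <= height):
                continue
            cost = block_sad(cur, prev, x, y, dx, dy, size)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best
```

The returned vector points from the block in the current frame to its best match in the previous frame. A single vector is produced for the whole block, which is why pixels that move differently inside one block are poorly predicted.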

Using a pixel-by-pixel approach, on the other hand, a displacement is determined for each and every pixel. This
technique allows a more exact estimation of the pixel value and has the ability to easily handle scale changes (e.g.,
zooming, movement perpendicular to the image plane). However, in the pixel-by-pixel approach, since a motion vector
is determined for each and every pixel, it is virtually impossible to transmit all of the motion vectors to a receiver.

One of the techniques introduced to ameliorate the problem of dealing with the surplus or superfluous transmission
data resulting from the pixel-by-pixel approach is a feature point-based motion estimation method. In the feature
point-based motion estimation technique, a set of selected pixels, i.e., feature points, are determined at an encoder in
a transmitting end and a decoder in a receiving end in an identical manner, and motion vectors for the feature points are
transmitted to the receiver without bearing position data for those feature points, wherein the feature points are defined
as pixels of a previous frame or a current frame capable of representing motions of objects in a video signal so that
motion vectors for all the pixels of the current frame can be recovered or approximated from those of the feature points
in the receiver. In an encoder which adopts the motion estimation based on feature points, disclosed in a commonly owned
copending application, U.S. Ser. No. 08/367,520, entitled "Method and Apparatus for Encoding a Video Signal Using
Pixel-by-Pixel Motion Estimation", a number of feature points are first selected from all of the pixels contained in the
previous frame. Then, motion vectors for the selected feature points are determined, wherein each of the motion vectors
represents a spatial displacement between one feature point in the previous frame and a corresponding matching point,
i.e., a most similar pixel, in the current frame. Specifically, the matching point for each of the feature points is searched
in a search region within the current frame, wherein the search region is defined as a region of a predetermined area
which encompasses the position of the corresponding feature point. In the feature point-based motion estimation
technique, since the current frame is predicted from the previous frame based on those motion vectors for a set of feature
points, it is important to select the feature points capable of correctly representing the movement of the object.

Typically, in an encoder and a decoder which adopt the motion estimation based on feature points, a number of
feature points are selected by using a grid technique or a combination of an edge detection technique and the grid
technique.

In the grid technique employing various types of grid, e.g., a rectangular or hexagonal grid, the nodes, i.e., grid
points of the grid, are determined as the feature points, and in the combination of the edge detection technique and the
grid technique, intersection points of the grid and the edge of the object are selected as the feature points. However, the
grid points or the intersection points do not always correctly represent the movement of the object, resulting in a poor
motion estimation of the object.



Summary of the Invention 

It is, therefore, a primary object of the present invention to provide an improved method and apparatus for
determining feature points through the use of pixel intensity gradients and variances of those pixels on object boundaries.

In accordance with the invention, there is provided an apparatus, for use in a video signal processor which adopts
a feature point based motion compensation technique, for determining feature points, said feature points being pixels
capable of representing motions of objects in a video frame, comprising: means for providing directional gradients and
a gradient magnitude for each pixel in the video frame; means for normalizing the directional gradients by dividing the
directional gradients by the gradient magnitude; means for generating a first edge map having the gradient magnitude
for each pixel; means for generating a second edge map having the normalized directional gradients for each pixel;
means for dividing the first edge map into a plurality of blocks of an identical size, wherein the blocks do not overlap
each other and each of the blocks includes a gradient magnitude corresponding to each of the pixels therein; means for
providing, for each of the pixels included in each of the blocks, normalized directional gradients for a set of a
predetermined number of pixels from the second edge map, wherein said set of pixels includes said each of the pixels; means
for obtaining a variance for each of the pixels included in each of the blocks based on the provided normalized
directional gradients; and means for determining a feature point for each of the blocks based on the gradient magnitude and
variance corresponding to each of the pixels therein.

Brief Description of the Drawings

The above and other objects and features of the present invention will become apparent from the following descrip- 
tion of preferred embodiments given in conjunction with the accompanying drawings, in which: 

Fig. 1 depicts a block diagram of the inventive apparatus for determining feature points;
Figs. 2A and 2B show horizontal and vertical Sobel operators;
Fig. 3 offers exemplary grid points generated by employing a rectangular grid; and
Fig. 4 represents a diagram explaining the feature point determination scheme employed in the present invention.

Detailed Description of the Preferred Embodiments

Referring to Fig. 1, there is illustrated an apparatus, for use in an encoder and a decoder which adopt a feature
point based motion compensation technique, for determining feature points in accordance with the present invention,
wherein the feature points are defined as pixels capable of representing motions of objects in a video signal. A digital
video signal of a video frame, e.g., a previous or a current frame, is fed to a gradient calculation block 100.

At the gradient calculation block 100, pixel intensity gradients for all of the pixels in the video frame are calculated
by using a gradient operator, e.g., a Sobel operator. The Sobel operator computes horizontal and vertical differences of
local sums, and has the desirable property of yielding zeros for uniform regions. In Figs. 2A and 2B, a horizontal and a
vertical Sobel operator, sobel(x) and sobel(y), are exemplarily illustrated, each boxed element indicating the location of
the origin. The horizontal and the vertical Sobel operators measure the gradient of an image I(x, y) in two orthogonal
directions. Directional gradients, i.e., horizontal and vertical gradients Gx(x, y) and Gy(x, y) at a pixel location (x, y), are
defined as:



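$$G_x(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} h^{(x)}(i, j)\, I(x+i, y+j)$$

$$G_y(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} h^{(y)}(i, j)\, I(x+i, y+j) \qquad \text{Eq. (1)}$$

wherein h(x)(i, j) and h(y)(i, j) are Sobel coefficients at (i, j) locations of the horizontal and vertical Sobel operators,
respectively.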
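
As an illustration of the directional gradient computation, the following sketch evaluates the double sum for every pixel. Figs. 2A and 2B are not reproduced in the text, so the conventional 3 x 3 Sobel coefficients are assumed; the name `directional_gradients`, the `[y][x]` list layout and the border replication are likewise assumptions made for this example.

```python
from typing import List, Tuple

Image = List[List[float]]

# Conventional 3x3 Sobel coefficients, indexed [j + 1][i + 1] for i, j in {-1, 0, 1}.
# Assumed here because Figs. 2A and 2B are not reproduced in the text.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def directional_gradients(image: Image) -> Tuple[Image, Image]:
    """Return (Gx, Gy), each indexed [y][x], as weighted sums over a 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    gx = [[0.0] * width for _ in range(height)]
    gy = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for j in (-1, 0, 1):
                for i in (-1, 0, 1):
                    # Assumption: border pixels are replicated outside the frame.
                    yy = min(max(y + j, 0), height - 1)
                    xx = min(max(x + i, 0), width - 1)
                    gx[y][x] += SOBEL_X[j + 1][i + 1] * image[yy][xx]
                    gy[y][x] += SOBEL_Y[j + 1][i + 1] * image[yy][xx]
    return gx, gy
```

A uniform image yields zero gradients everywhere, which matches the property stated for the Sobel operator, while a vertical step edge produces a non-zero horizontal gradient only along the step.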
A gradient magnitude g(x, y) at the pixel location (x, y) is then given by

$$g(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2} \qquad \text{Eq. (2)}$$

or

$$g(x, y) = |G_x(x, y)| + |G_y(x, y)|$$

The gradient magnitude g(x, y) is applied to an edge detection block 200 for detecting edge points on object
boundaries, and the directional gradients Gx(x, y) and Gy(x, y) are applied to a normalization block 300 for the normalization
thereof.

The edge detection block 200 detects edge points in the video frame by comparing a gradient magnitude for each 
pixel in the video frame with a predetermined threshold value TH. That is, the pixel location (x, y) is an edge point if g(x, 
y) exceeds TH. 

Typically, the predetermined threshold value TH may be selected using the cumulative histogram of g(x, y) so that
5 to 10 % of pixels with largest gradient magnitudes are determined as edges. The locations of the detected edge points
constitute a first edge map E(x, y), which is defined as:

$$E(x, y) = \begin{cases} g(x, y), & (x, y) \in \{(x, y) : g(x, y) > TH\} \\ 0, & \text{otherwise} \end{cases} \qquad \text{Eq. (3)}$$



That is, the edge map is formed by allocating gradient magnitudes to their respective edge points and "zeros" to
non-edge points. The edge map provides boundary information for tracing the object boundaries in the image, wherein the
boundary information includes position data of the pixels in the video frame and gradient magnitudes corresponding to
respective pixel positions. The boundary information produced by the edge detection block 200 is fed to a frame
memory 500 and stored therein as the first edge map.
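
The magnitude, threshold and edge map steps can be sketched as follows. Sorting the magnitudes stands in for reading TH off a cumulative histogram; this substitution, the parameter `edge_fraction` and all function names are assumptions made for illustration.

```python
import math
from typing import List

Image = List[List[float]]


def gradient_magnitude(gx: Image, gy: Image, use_abs_sum: bool = False) -> Image:
    """Euclidean magnitude of (Gx, Gy), or the cheaper |Gx| + |Gy| alternative."""
    if use_abs_sum:
        return [[abs(a) + abs(b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]
    return [[math.hypot(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(gx, gy)]


def threshold_from_fraction(g: Image, edge_fraction: float) -> float:
    """Pick TH so that roughly `edge_fraction` of the pixels satisfy g > TH."""
    values = sorted(v for row in g for v in row)
    keep = int(round(len(values) * edge_fraction))
    keep = min(max(keep, 1), len(values) - 1)
    # TH is the value just below the `keep` largest magnitudes.
    return values[len(values) - keep - 1]


def first_edge_map(g: Image, th: float) -> Image:
    """Keep g(x, y) where it exceeds TH and write zero elsewhere."""
    return [[v if v > th else 0.0 for v in row] for row in g]
```

With `edge_fraction` between 0.05 and 0.10 this follows the 5 to 10 % guideline given above; when several pixels share the threshold magnitude, fewer pixels than requested may survive.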

At the normalization block 300, the directional gradients Gx(x, y) and Gy(x, y) supplied from the gradient calculation 
block 100 are normalized as follows: 



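$$U_x(x, y) = \begin{cases} \dfrac{G_x(x, y)}{\sqrt{G_x(x, y)^2 + G_y(x, y)^2}}, & (x, y) \in \{(x, y) : g(x, y) > 0\} \\ 0, & \text{otherwise} \end{cases}$$

$$U_y(x, y) = \begin{cases} \dfrac{G_y(x, y)}{\sqrt{G_x(x, y)^2 + G_y(x, y)^2}}, & (x, y) \in \{(x, y) : g(x, y) > 0\} \\ 0, & \text{otherwise} \end{cases} \qquad \text{Eq. (4)}$$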



wherein Ux(x, y) and Uy(x, y) represent the normalized horizontal and vertical gradients of the respective gradients
Gx(x, y) and Gy(x, y) at a pixel location (x, y). The position data of the pixels and the normalized gradients Ux(x, y) and
Uy(x, y) corresponding to respective pixel positions are provided to a frame memory 400 and stored therein as a second
edge map.
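
A minimal sketch of the normalization performed at block 300 is given here for illustration. The function name `normalize_gradients` and the `[y][x]` list layout are assumptions; the zero output for pixels with zero gradient magnitude follows the "otherwise" branch of the definition.

```python
import math
from typing import List, Tuple

Image = List[List[float]]


def normalize_gradients(gx: Image, gy: Image) -> Tuple[Image, Image]:
    """Return (Ux, Uy): unit-length gradient components, or zeros where g(x, y) = 0."""
    ux = [[0.0] * len(row) for row in gx]
    uy = [[0.0] * len(row) for row in gy]
    for y, (row_x, row_y) in enumerate(zip(gx, gy)):
        for x, (a, b) in enumerate(zip(row_x, row_y)):
            g = math.hypot(a, b)  # Euclidean gradient magnitude
            if g > 0:
                ux[y][x] = a / g
                uy[y][x] = b / g
    return ux, uy
```

After this step each non-zero pair (Ux, Uy) only encodes the gradient direction, so a later variance over these values measures how much the boundary direction changes rather than how strong the edge is.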

In the meantime, a grid point generation block 600 provides a plurality of grid points to an address generator 700.
The grid points are pixel positions, e.g., A to F, located at the nodes of a grid, e.g., a rectangular grid depicted in dotted
lines as shown in Fig. 3, wherein each grid point is N pixels apart from its neighboring grid points in the horizontal and
vertical directions, N being an even integer. The address generator 700 generates, for each grid point, a set of first
address data which represents locations of (N+1) x (N+1), e.g., 9 x 9, pixels constituting a first processing block, the
first processing block having the grid point at the center thereof; and generates (N+1) x (N+1) sets of second address
data, each set of the second address data representing locations of (2M+1) x (2M+1), e.g., 11 x 11, pixels (M being an
odd integer) which form a second processing block, each second processing block having a respective one of the
(N+1) x (N+1) pixels included in the first processing block at the center thereof. The set of first address data and the
sets of second address data for each grid point are fed to the frame memories 500 and 400, respectively.
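
The address generation can be sketched as below, using N = 8 and M = 5 from the examples in the text. The grid offset of N/2 from the frame corner and the clipping of blocks at the frame border are assumptions, since the text does not specify either; `grid_points` and `block_addresses` are hypothetical names.

```python
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) pixel position


def grid_points(width: int, height: int, n: int) -> List[Point]:
    """Nodes of a rectangular grid with spacing n; the first node sits at n // 2 (assumption)."""
    half = n // 2
    return [(x, y)
            for y in range(half, height - half, n)
            for x in range(half, width - half, n)]


def block_addresses(center: Point, half: int, width: int, height: int) -> List[Point]:
    """Positions of the (2*half + 1) x (2*half + 1) block around center, clipped to the frame."""
    cx, cy = center
    return [(x, y)
            for y in range(cy - half, cy + half + 1)
            for x in range(cx - half, cx + half + 1)
            if 0 <= x < width and 0 <= y < height]
```

For a grid point `p`, `block_addresses(p, 8 // 2, width, height)` gives the first address data of the 9 x 9 first processing block, and calling `block_addresses(q, 5, width, height)` for each position `q` in that block gives the 81 sets of second address data, each covering an 11 x 11 second processing block.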

In response to the set of first address data for each grid point from the address generator 700, first edge map data
corresponding to the first processing block is retrieved from the frame memory 500 and provided to a variance calculation
block 800, wherein the first edge map data represents position data of the (N+1) x (N+1) pixels included in the first
processing block and gradient magnitudes corresponding to respective pixel positions. In the meantime, in response to
each set of the second address data generated from the address generator 700, second edge map data corresponding
to each of the (N+1) x (N+1) second processing blocks is retrieved from the frame memory 400 and fed to the variance
calculation block 800, wherein the second edge map data represents position data of the (2M+1) x (2M+1) pixels
included in the second processing block and normalized directional gradients corresponding to those pixel positions.

At the variance calculation block 800, a variance of the normalized directional gradients included in each of the
(N+1) x (N+1) second processing blocks is calculated and set to a variance for a pixel at the center thereof. As is well
known, a variance is a measure of deviation of sample values from their mean value, which implies that the greater the
variance, the greater the degree of distribution of the gradients, i.e., the more complicated the boundary configuration
around the center pixel.

A variance Var(x, y) at a position (x, y) may then be defined as: 

$$Var(x, y) = \frac{1}{(2M+1)^2} \sum_{i=-M}^{M} \sum_{j=-M}^{M} \left[ \left( U_x(x+i, y+j) - \bar{U}_x(x, y) \right)^2 + \left( U_y(x+i, y+j) - \bar{U}_y(x, y) \right)^2 \right] \qquad \text{Eq. (5)}$$

wherein Ux(x+i, y+j) and Uy(x+i, y+j) are normalized horizontal and vertical gradients at pixel locations within a
second processing block with a pixel location (x, y) at the center thereof.

$\bar{U}_x(x, y)$ and $\bar{U}_y(x, y)$ are average values of the normalized horizontal and vertical gradients included in the
second processing block, which may be defined as:

$$\bar{U}_x(x, y) = \frac{1}{(2M+1)^2} \sum_{i=-M}^{M} \sum_{j=-M}^{M} U_x(x+i, y+j)$$

$$\bar{U}_y(x, y) = \frac{1}{(2M+1)^2} \sum_{i=-M}^{M} \sum_{j=-M}^{M} U_y(x+i, y+j) \qquad \text{Eq. (6)}$$
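
The variance of Eq. (5), with the window means of Eq. (6), might be computed along the following lines. The function name `gradient_variance`, the `[y][x]` list layout and the treatment of positions outside the frame as zero gradients are assumptions made for this sketch.

```python
from typing import List

Image = List[List[float]]


def gradient_variance(ux: Image, uy: Image, x: int, y: int, m: int) -> float:
    """Eq. (5) for the (2m+1) x (2m+1) window centred on (x, y); maps are indexed [y][x]."""
    height, width = len(ux), len(ux[0])
    xs, ys = [], []
    for j in range(-m, m + 1):
        for i in range(-m, m + 1):
            # Assumption: positions outside the frame contribute zero gradients.
            inside = 0 <= x + i < width and 0 <= y + j < height
            xs.append(ux[y + j][x + i] if inside else 0.0)
            ys.append(uy[y + j][x + i] if inside else 0.0)
    count = (2 * m + 1) ** 2
    mean_x = sum(xs) / count  # Eq. (6), horizontal component
    mean_y = sum(ys) / count  # Eq. (6), vertical component
    return sum((a - mean_x) ** 2 + (b - mean_y) ** 2 for a, b in zip(xs, ys)) / count
```

A window in which every pixel has the same gradient direction gives a variance of zero, whereas a window mixing several directions, or edge and non-edge pixels, gives a larger value.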



Thereafter, the variance calculation block 800 provides third edge map data for each first processing block to a first
selector 900, wherein the third edge map data includes pixel position data of the (N+1) x (N+1) pixels within the first
processing block and gradient magnitudes and calculated variances Var(x, y) corresponding to respective pixel
positions included in the first processing block.

The first selector 900 selects a maximum of P, e.g., 5, pixels in the order of variance magnitudes beginning from the
largest one, P being a predetermined number larger than 1. Specifically, if the first processing block includes P or
more pixels having non-zero valued gradient magnitudes, P pixels are selected therefrom in a descending order of their
variances; if fewer than P pixels having non-zero valued gradient magnitudes exist, all of those pixels are selected; and
if all the pixels in the first processing block have zero valued gradient magnitudes, no pixel is selected.
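
The three cases handled by the first selector collapse into a filter, a sort and a slice, as the sketch below illustrates. The tuple layout `(x, y, gradient magnitude, variance)` and the name `select_by_variance` are assumptions, and ties in variance are resolved by input order, which the text does not specify.

```python
from typing import List, Tuple

# (x, y, gradient magnitude, variance) for one pixel of a first processing block
Candidate = Tuple[int, int, float, float]


def select_by_variance(block: List[Candidate], p: int) -> List[Candidate]:
    """Keep at most p pixels with non-zero gradient magnitude, largest variance first."""
    edge_pixels = [c for c in block if c[2] > 0]
    edge_pixels.sort(key=lambda c: c[3], reverse=True)
    return edge_pixels[:p]
```

If fewer than `p` edge pixels exist, the slice returns all of them, and a block without any edge pixel yields an empty list.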

Referring to Fig. 4, there is illustrated a diagram explaining the feature point determination scheme employed in the
present invention. Assume that a displacement of an object between two video frames is MV and that two feature points
FP1 and FP2 are selected on the boundary of the object. Normally, a motion vector of a feature point is determined by
using a block matching algorithm. That is, a motion vector for a search block of, e.g., 5 x 5 pixels having the feature point
at the center thereof is determined by using the conventional block matching algorithm, and the motion vector of the
search block is assigned as the motion vector of the feature point. In such a case, since the feature point FP1 is situated
on a rather complicated portion of the object boundary, a matching point of the feature point FP1 can be uniquely
determined at a real matching point FP1'. On the other hand, the boundary configuration around the feature point FP2 is
relatively simple, so that a matching point of the feature point FP2 may be assigned to a point, e.g., FP2', FP2'' or FP2''',
on a similar boundary configuration. Accordingly, the motion vector for the feature point FP1 having a larger variance of
gradients has more chance to reflect the real motion of the object than that for the feature point FP2 having a smaller
variance.
Subsequently, the first selector 900 provides fourth edge map data to a second selector 1000, the fourth edge map
data including position data of the selected pixels and a gradient magnitude corresponding to each of the selected
maximum P pixels.

The second selector 1000 compares the gradient magnitudes in the fourth edge map data provided from the first
selector 900 and selects a pixel having a largest gradient magnitude, thereby setting the pixel as a selected feature
point. An output from the second selector 1000 is position data of the selected feature point.

In accordance with the present invention, for each block which includes one or more pixels having non-zero valued
gradient magnitudes, a pixel with the greatest magnitude is selected among pixels having largest variances in the block
as a feature point of the block. As a result, each feature point is determined on a portion of the object boundaries having
a complicated configuration, which is conducive to the better estimation of motion vectors for the feature points.
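
Putting the two selection stages together, the per-block decision summarized above might look like the following sketch. The tuple layout, the name `feature_point` and the use of `None` for a block without edge pixels are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

Candidate = Tuple[int, int, float, float]  # (x, y, gradient magnitude, variance)


def feature_point(block: List[Candidate], p: int) -> Optional[Tuple[int, int]]:
    """Return the (x, y) feature point of one block, or None if it has no edge pixel."""
    edge_pixels = [c for c in block if c[2] > 0]
    # First stage: up to p edge pixels with the largest variances.
    shortlist = sorted(edge_pixels, key=lambda c: c[3], reverse=True)[:p]
    if not shortlist:
        return None
    # Second stage: the shortlisted pixel with the largest gradient magnitude.
    x, y, _, _ = max(shortlist, key=lambda c: c[2])
    return (x, y)
```

Note that a pixel with a very large gradient magnitude but a small variance never reaches the second stage, which reflects the preference for complicated boundary portions over strong but straight edges.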

Even though the preferred embodiments of the invention have been described with reference to the first processing
blocks of (N+1) x (N+1) pixels having a grid point at the center thereof, it should be apparent to those skilled in the art
that the first processing block can be made to have N1 x N2 pixels as long as a set of first processing blocks constitute
the video frame, N1 and N2 being positive integers.

While the present invention has been shown and described with reference to the particular embodiments, it will be
apparent to those skilled in the art that many changes and modifications may be made without departing from the spirit
and scope of the invention as defined in the appended claims.

Claims 

1. An apparatus, for use in a video signal processor which adopts a feature point based motion compensation
technique, for determining feature points, said feature points being pixels capable of representing motions of objects in
a video frame, comprising:

means for providing directional gradients and a gradient magnitude for each pixel in the video frame;

means for normalizing the directional gradients by dividing the directional gradients by the gradient magnitude;

means for generating a first edge map having the gradient magnitude for each pixel;

means for generating a second edge map having the normalized directional gradients for each pixel;

means for dividing the first edge map into a plurality of blocks of an identical size, wherein the blocks do not
overlap each other and each of the blocks includes a gradient magnitude corresponding to each of the pixels
therein;

means for providing, for each of the pixels included in each of the blocks, normalized directional gradients for
a set of a predetermined number of pixels from the second edge map, wherein said set of pixels includes said
each of the pixels;

means for obtaining a variance for each of the pixels included in each of the blocks based on the provided
normalized directional gradients; and

means for determining a feature point for each of the blocks based on the gradient magnitude and variance
corresponding to each of the pixels therein.

2. The apparatus according to claim 1, wherein said determination means includes:

means for selecting maximum P pixels, for each of the blocks, in the order of the variances thereof beginning
from a largest one, P being a predetermined number larger than 1, such that if P or more pixels having
non-zero valued gradient magnitudes are included in each of the blocks, P pixels are selected in a descending order
of their variances, if fewer than P number of pixels having non-zero valued gradient magnitudes exist, all of
those pixels are selected, and if all the pixels in each of the blocks have the zero-valued gradient magnitude,
no pixel is selected; and

means for determining a pixel having a largest gradient magnitude among the selected maximum P pixels as the
feature point for each of the blocks.

3. A method, for use in a video signal processor which adopts a feature point based motion compensation technique,
for determining feature points, said feature points being pixels capable of representing motions of objects in a video
frame, comprising the steps of:

(a) providing directional gradients and a gradient magnitude for each pixel in the video frame;

(b) normalizing the directional gradients by dividing the directional gradients by the gradient magnitude;

(c) generating a first edge map having the gradient magnitude for each pixel;

(d) generating a second edge map having the normalized directional gradients for each pixel;

(e) dividing the first edge map into a plurality of blocks of an identical size, wherein the blocks do not overlap
each other and each of the blocks includes a gradient magnitude corresponding to each of the pixels therein;

(f) providing, for each of the pixels included in each of the blocks, normalized directional gradients for a set of
a predetermined number of pixels from the second edge map, wherein said set of pixels includes said each of
the pixels;

(g) obtaining a variance for each of the pixels included in each of the blocks based on the provided normalized
directional gradients; and

(h) determining a feature point for each of the blocks based on the gradient magnitude and variance
corresponding to each of the pixels therein.

4. The method in accordance with claim 3, wherein said step (h) includes the steps of:

(h1) selecting maximum P pixels, for each of the blocks, in the order of the variances thereof beginning from a
largest one, P being a predetermined number larger than 1, such that if P or more pixels having non-zero
valued gradient magnitudes are included in each of the blocks, P pixels are selected in a descending order of their
variances, if fewer than P number of pixels having non-zero valued gradient magnitudes exist, all of those
pixels are selected, and if all the pixels in each of the blocks have the zero-valued gradient magnitude, no pixel is
selected; and

(h2) determining a pixel having a largest gradient magnitude among the selected maximum P pixels as the
feature point for each of the blocks.